<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>Eclipse Platform/Core</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" href="http://dev.eclipse.org/default_style.css" type="text/css">
</head>
<body>
<center>
	<font class=indextop>core</font><br>
	<font class=indexsub>the foundation of the platform</font><p></p>
	<a href="../main.html">[home]</a>
	<a href="../documents.html">[documents]</a>
	<a href="../downloads.html">[downloads]</a>
	<a href="../resources.html">[resources]</a>
	<a href="../planning.html">[planning]</a>
	<a href="../testing.html">[testing]</a>
</center>
<br>
<table BORDER=0 CELLPADDING=2 WIDTH="100%" >
	<tr> 
		<td ALIGN=LEFT VALIGN=TOP COLSPAN="2" BGCOLOR="#0080C0"><b><font face="Arial,Helvetica" color="#FFFFFF">History Store Modifications</font></b>			</td>
	</tr>
	<tr> 
		<td ALIGN=RIGHT VALIGN=TOP WIDTH="2%"><img SRC="../images/Adarrow.gif" BORDER=0 height=16 width=16></td>
    	<td WIDTH="98%"><b>Summary</b> 
		<p> The history store current implementation is said to be problem-prone 
        (data corruption) and hard to maintain. There is an ongoing effort to 
        replace it with something simpler. One of the options would be the adoption 
        of a third-party provided storage engine, such as a embedded database 
        or a b-tree engine. Another is to provide a very simple home-made solution 
        with as little (conceptual/memory/execution) overhead as possible. </p>
		
      <ul>
        <li> 
          <p><em>2004/10/29</em> - tested a file system based solution (<em>take 
            1</em>). Every project has its own history area. History for a file 
            will be stored in a relative directory similar to where the file is. 
            The difference is that, in order to prevent problems with path length 
            limitations on Windows, every segment is translated to a short fixed 
            length string (using a hash function). This opens opportunity for 
            colisions, and it is an one-way translation, so we need to remember 
            the original path for every state. We did this by keeping an additional 
            .info file for each file state to maintain that information. Result: 
            adding new state is really fast, but any tree traversals are very 
            costly (we have to open all .info files in every directory we look).</p>
        </li>
        <li> 
          <p><em>2004/11/02</em> - tested an improved solution (<em>take 2</em>) 
            based on the previous approach. Instead of keeping an individual .info 
            file showing the original location for every state, we keep a index 
            table per directory. This drastically reduces the number of file openings 
            (not mentioning it takes much less space in the file system), having 
            a better performance for tree based operations. The poor performance 
            for copying history is understandable since we are actually copying 
            the states as well, whereas the original implementation links them 
            (maybe we should do the same). The bad performance when clearing the 
            history for a given resource is due to having to delete multiple files/directories 
            in a more complex directory structure than the original implementation 
            (a two-level tree, directories are never deleted). We could improve 
            that by postponing deleting directories to shutdown (making shutdown 
            slower instead).</p>
        </li>
        <li> 
          <p><em>2004/11/17</em> - using some ideas from the previous tries and 
            the existing code, we got a solution (<em>take 3</em>) that seems 
            to have all the characteristics we have been looking for: it is much 
            simpler than the current story, and generally performs (sometimes 
            much) better than the current implementation. We are using the existing 
            BlobStore (as the existing implementation) and per-bucket indexes, 
            as in the previous solutions. These indexes are just dictionaries 
            having the file path as key and a pair &lt;UUID,timestamps&gt; for 
            every state belonging to the file as value.</p>
        </li>
        <li> 
          <p><em>2004/12/13</em> - the solution above (known as <em>take 3</em>) 
            has been part of recent integration builds and will be in for M4. 
            We are currently investigating its use as a replacement for the previous 
            property store implementation as well. There are concerns w.r.t. scalability 
            in using a single file to store all properties for all files/folders 
            under a given directory (since properties can be as big as 2K). Some 
            sort of segmentation strategy seems necessary to avoid such problems.</p>
        </li>
        <li><em>2005/02/02</em> - Eclipse 3.1 M5 - both history store and property 
          store implementations were moved to a new infrastructure. The existing 
          indexed store has been moved to a separate fragment, only used for supporting 
          backward compatibility with old workspaces. Performance tests are part 
          of the org.eclipse.core.tests.resources project.</li>
      </ul>
		</td>
	</tr>
	<tr> 
		<td ALIGN=RIGHT VALIGN=TOP WIDTH="2%"><img SRC="../images/Adarrow.gif" BORDER=0 height=16 width=16></td>
    	
    <td WIDTH="98%"><b>Performance Comparisons</b> 
    <p>Following is a comparison of performance (execution time) for the most 
        common use cases:</p>
    
      <table width="70%" border="1">
        <tr> 
          <td width="60%"><strong> 
            <div align="center">Use cases</div>
            </strong></td>
          <td width="20%"><strong> 
            <div align="center">Original (ms)</div>
            </strong></td>
          <td width="20%"><strong> 
            <div align="center">Take 3 (ms)</div>
            </strong></td>
        </tr>
        <tr> 
          <td><em>add state</em></td>
          <td align="center">
<div align="center">498</div></td>
          <td align="center">
<div align="center">481</div></td>
        </tr>
        <tr> 
          <td><em>get history for file</em></td>
          <td align="center">
<div align="center">1280</div></td>
          <td align="center">
<div align="center">16</div></td>
        </tr>
        <tr> 
          <td><em>find deleted members (100 files, 4 states per file)</em></td>
          <td align="center">914</td>
          <td align="center">46</td>
        </tr>
        <tr> 
          <td><em>find deleted members (20 files, 20 states per file)</em></td>
          <td align="center">906</td>
          <td align="center">15</td>
        </tr>
        <tr> 
          <td><em>find deleted members (4 files, 100 states per file)</em></td>
          <td align="center">
<div align="center">882</div></td>
          <td align="center">
<div align="center">15</div></td>
        </tr>
        <tr> 
          <td><em>clear history of a directory tree (100 files, 4 states per file)</em></td>
          <td align="center">367</td>
          <td align="center">42</td>
        </tr>
        <tr> 
          <td><em>clear history of a directory tree (20 files, 20 states per file)</em></td>
          <td align="center">332</td>
          <td align="center">187</td>
        </tr>
        <tr> 
          <td><em>clear history of a directory tree (4 files, 100 states per file)</em></td>
          <td align="center">
<div align="center">379</div></td>
          <td align="center">
<div align="center">179</div></td>
        </tr>
        <tr>
          <td><em>copy history for a directory tree (100 files, 4 states per file)</em></td>
          <td align="center">3350</td>
          <td align="center">3250</td>
        </tr>
        <tr> 
          <td><em>copy history for a directory tree (20 files, 20 states per file)</em></td>
          <td align="center">2210</td>
          <td align="center">868</td>
        </tr>
        <tr> 
          <td><em>copy history for a directory tree (4 files, 100 states per file)</em></td>
          <td align="center">4720</td>
          <td align="center">453</td>
        </tr>
        <tr> 
          <td><em>apply cleaning policy</em></td>
          <td align="center">
<div align="center">tbd</div></td>
          <td align="center">
<div align="center">tbd</div></td>
        </tr>
      </table> </td>
	</tr>
</table>


</body>
</html>